Important: The functionality described in this notebook is not part of the current package release. To access it, you need to obtain the code from the line-finder branch and install the package from source.
This tool lists the extrema in the internally calibrated mean spectra. The spectra are not evaluated in either sampled or continuous form.
# Import the tool
from gaiaxpy import find_fast
The available input types are: a pandas DataFrame, an ADQL query, a list of source IDs, and a path to a file with XP CONTINUOUS RAW data (csv, ecsv, fits, or xml).
import pandas as pd
f = '/path/to/XP_CONTINUOUS_RAW.csv'
df = pd.read_csv(f)
extrema_pwl = find_fast(df)
extrema_pwl.head()
Reading input DataFrame... Done!
0/2 [00:00<?, ?spec/s]
Done! Output saved to path: ./output_lines.csv
| | source_id | xp | extrema |
---|---|---|---|
0 | 5853498713190525696 | BP | 14.874711 |
1 | 5853498713190525696 | BP | 16.885261 |
2 | 5853498713190525696 | BP | 17.522690 |
3 | 5853498713190525696 | BP | 31.734373 |
4 | 5853498713190525696 | BP | 31.944037 |
query_input = "select TOP 2 source_id from gaiadr3.gaia_source where has_xp_continuous = 'True'"
extrema_pwl = find_fast(query_input)
extrema_pwl.head()
INFO: Query finished. [astroquery.utils.tap.core] Running query... Done!
0/2 [00:00<?, ?spec/s]
Done! Output saved to path: ./output_lines.csv
| | source_id | xp | extrema |
---|---|---|---|
0 | 488289544781895936 | BP | 15.714108 |
1 | 488289544781895936 | BP | 34.459567 |
2 | 488289544781895936 | BP | 34.594935 |
3 | 488289544781895936 | BP | 36.994297 |
4 | 488289544781895936 | BP | 38.556855 |
A list of source IDs can be passed to the fast finder as the first argument. The tool will then query the Archive for these objects.
sources_list = ['5853498713190525696', 5762406957886626816] # The source IDs can be either strings or long integers.
extrema_pwl = find_fast(sources_list)
extrema_pwl.head()
Running query... Done!
0/2 [00:00<?, ?spec/s]
Done! Output saved to path: ./output_lines.csv
| | source_id | xp | extrema |
---|---|---|---|
0 | 5853498713190525696 | BP | 14.874711 |
1 | 5853498713190525696 | BP | 16.885261 |
2 | 5853498713190525696 | BP | 17.522690 |
3 | 5853498713190525696 | BP | 31.734373 |
4 | 5853498713190525696 | BP | 31.944037 |
f = '/path/to/XP_CONTINUOUS_RAW.fits'
extrema_pwl = find_fast(f)
extrema_pwl.head() # Only the first few rows of the output are displayed when head() is used.
Reading input file... Done!
0/2 [00:00<?, ?spec/s]
Done! Output saved to path: ./output_lines.fits
| | source_id | xp | extrema |
---|---|---|---|
0 | 5853498713190525696 | BP | 14.874711 |
1 | 5853498713190525696 | BP | 16.885261 |
2 | 5853498713190525696 | BP | 17.522690 |
3 | 5853498713190525696 | BP | 31.734373 |
4 | 5853498713190525696 | BP | 31.944037 |
The fast finder returns all extrema found as a pandas DataFrame. For each source, the DataFrame contains the pseudo-wavelengths of the extrema in BP and RP.
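Because the result is an ordinary pandas DataFrame, it can be summarised with standard pandas operations. The following is a minimal sketch that assumes the three columns shown in the outputs above (`source_id`, `xp`, `extrema`); the DataFrame is rebuilt by hand from the printed rows so that the example runs without GaiaXPy installed.

```python
import pandas as pd

# Stand-in for the DataFrame returned by find_fast, using the rows printed above.
extrema_pwl = pd.DataFrame({
    "source_id": [5853498713190525696] * 5,
    "xp": ["BP"] * 5,
    "extrema": [14.874711, 16.885261, 17.522690, 31.734373, 31.944037],
})

# Number of extrema found per source and per band (BP or RP).
counts = extrema_pwl.groupby(["source_id", "xp"]).size()

# Pseudo-wavelengths of the BP extrema of one source, as a plain list.
bp_mask = (extrema_pwl["source_id"] == 5853498713190525696) & (extrema_pwl["xp"] == "BP")
bp_extrema = extrema_pwl.loc[bp_mask, "extrema"].tolist()
```

With the real output, the same `groupby` call would also produce one entry per source for the RP band.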
This tool looks for all extrema in the internally calibrated mean spectra. It also converts internally calibrated mean spectra from the continuous representation to an externally calibrated sampled form. The converted spectrum is used to provide basic properties of detected extrema.
# Import the tool
from gaiaxpy import find_extrema
The available input types are: a pandas DataFrame, an ADQL query, a list of sourceIds, and a path to a file with XP CONTINUOUS RAW data (csv, ecsv, fits, or xml).
import pandas as pd
f = '/path/to/XP_CONTINUOUS_RAW.csv'
df = pd.read_csv(f) # The values in the DataFrame can be edited if the user wishes to do so.
extrema = find_extrema(df)
extrema.head()
Preparing required internal data...
0/2 [00:00<?, ?spec/s]
Done! Output saved to path: ./output_lines.csv
| | source_id | line_name | wavelength_nm | line_flux | depth | width | significance | sig_pwl |
---|---|---|---|---|---|---|---|---|
0 | 5853498713190525696 | bp_330 | 330.020145 | 6.956654e-16 | 5.180523e-18 | 3.970387 | 0.042459 | 8.510702 |
1 | 5853498713190525696 | bp_334 | 334.593994 | 4.164433e-16 | -1.505134e-17 | 3.816551 | 0.185722 | 4.731364 |
2 | 5853498713190525696 | bp_337 | 337.632124 | 2.309730e-16 | -1.001079e-16 | 3.614516 | 1.859752 | 3.284514 |
3 | 5853498713190525696 | bp_341 | 341.739965 | 2.065532e-16 | -2.151889e-17 | 3.701070 | 0.602507 | 3.552140 |
4 | 5853498713190525696 | bp_344 | 344.993479 | 1.072451e-16 | -1.218512e-16 | 3.931419 | 3.756464 | 3.915424 |
query_input = "select TOP 2 source_id from gaiadr3.gaia_source where has_xp_continuous = 'True'"
extrema = find_extrema(query_input)
extrema.head()
INFO: Query finished. [astroquery.utils.tap.core] Preparing required internal data...
0/2 [00:00<?, ?spec/s]
Done! Output saved to path: ./output_lines.csv
| | source_id | line_name | wavelength_nm | line_flux | depth | width | significance | sig_pwl |
---|---|---|---|---|---|---|---|---|
0 | 488289544781895936 | bp_333 | 333.993091 | -3.375660e-18 | 1.406177e-18 | 5.592031 | 0.571797 | 1.328987 |
1 | 488289544781895936 | bp_340 | 340.467525 | 1.582020e-18 | 3.987157e-18 | 7.154992 | 3.665364 | 2.238799 |
2 | 488289544781895936 | bp_348 | 348.374722 | -1.012049e-18 | -1.692863e-18 | 7.203026 | 1.881983 | 2.434976 |
3 | 488289544781895936 | bp_363 | 363.364787 | 1.090649e-18 | 1.395584e-18 | 6.722927 | 1.704171 | 2.134727 |
4 | 488289544781895936 | bp_370 | 370.041668 | -1.289301e-18 | -2.375571e-18 | 7.546684 | 2.635407 | 3.594792 |
A list of source IDs can be passed to the extrema finder as the first argument. The tool will then query the Archive for these objects.
sources_list = ['5853498713190525696', 5762406957886626816] # The source IDs can be either strings or long integers.
extrema = find_extrema(sources_list)
extrema.head()
Preparing required internal data...
0/2 [00:00<?, ?spec/s]
Done! Output saved to path: ./output_lines.csv
| | source_id | line_name | wavelength_nm | line_flux | depth | width | significance | sig_pwl |
---|---|---|---|---|---|---|---|---|
0 | 5853498713190525696 | bp_330 | 330.020145 | 6.956654e-16 | 5.180523e-18 | 3.970387 | 0.042459 | 8.510702 |
1 | 5853498713190525696 | bp_334 | 334.593994 | 4.164433e-16 | -1.505134e-17 | 3.816551 | 0.185722 | 4.731364 |
2 | 5853498713190525696 | bp_337 | 337.632124 | 2.309730e-16 | -1.001079e-16 | 3.614516 | 1.859752 | 3.284514 |
3 | 5853498713190525696 | bp_341 | 341.739965 | 2.065532e-16 | -2.151889e-17 | 3.701070 | 0.602507 | 3.552140 |
4 | 5853498713190525696 | bp_344 | 344.993479 | 1.072451e-16 | -1.218512e-16 | 3.931419 | 3.756464 | 3.915424 |
f = '/path/to/XP_CONTINUOUS_RAW.fits'
extrema = find_extrema(f)
extrema.head()
Preparing required internal data...
0/2 [00:00<?, ?spec/s]
Done! Output saved to path: ./output_lines.fits
| | source_id | line_name | wavelength_nm | line_flux | depth | width | significance | sig_pwl |
---|---|---|---|---|---|---|---|---|
0 | 5853498713190525696 | bp_330 | 330.020145 | 6.956654e-16 | 5.180523e-18 | 3.970387 | 0.042459 | 8.510702 |
1 | 5853498713190525696 | bp_334 | 334.593994 | 4.164433e-16 | -1.505134e-17 | 3.816551 | 0.185722 | 4.731364 |
2 | 5853498713190525696 | bp_337 | 337.632124 | 2.309730e-16 | -1.001079e-16 | 3.614516 | 1.859752 | 3.284514 |
3 | 5853498713190525696 | bp_341 | 341.739965 | 2.065532e-16 | -2.151889e-17 | 3.701070 | 0.602507 | 3.552140 |
4 | 5853498713190525696 | bp_344 | 344.993479 | 1.072451e-16 | -1.218512e-16 | 3.931419 | 3.756464 | 3.915424 |
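All extrema found and their properties are returned as a pandas DataFrame. For each detected extremum, the following properties are provided, as shown in the columns above: line_name, wavelength_nm, line_flux, depth, width, significance, and sig_pwl.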
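As a rough illustration, the returned DataFrame can be filtered on any of these columns. The sketch below rebuilds a few columns of the printed output by hand and keeps only the extrema above a significance threshold; the threshold value is an arbitrary assumption chosen for the example, not a recommendation.

```python
import pandas as pd

# Stand-in for part of the find_extrema output printed above.
extrema = pd.DataFrame({
    "source_id": [5853498713190525696] * 5,
    "line_name": ["bp_330", "bp_334", "bp_337", "bp_341", "bp_344"],
    "wavelength_nm": [330.020145, 334.593994, 337.632124, 341.739965, 344.993479],
    "significance": [0.042459, 0.185722, 1.859752, 0.602507, 3.756464],
})

min_significance = 1.0  # illustrative threshold only

# Keep the rows whose significance reaches the threshold.
significant = extrema[extrema["significance"] >= min_significance]
```

The same pattern applies to `sig_pwl` or any other numeric column.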
By default, this tool uses a short list of lines (H_alpha, H_beta, HeI).
# Import the tool
from gaiaxpy import find_lines
The available input types are: a pandas DataFrame, an ADQL query, a list of sourceIds, and a path to a file with XP CONTINUOUS RAW data (csv, ecsv, fits, or xml).
import pandas as pd
f = '/path/to/XP_CONTINUOUS_RAW.csv'
df = pd.read_csv(f)
lines = find_lines(df)
lines
Preparing required internal data...
0/2 [00:00<?, ?spec/s]
Done! Output saved to path: ./output_lines.csv
| | source_id | line_name | wavelength_nm | line_flux | depth | width | significance | sig_pwl |
---|---|---|---|---|---|---|---|---|
0 | 5853498713190525696 | He I_2 | 586.310487 | 1.236627e-15 | -5.690724e-17 | 22.490756 | 4.207830 | 5.056023 |
1 | 5853498713190525696 | H_alpha | 655.968059 | 4.178223e-15 | 1.856800e-15 | 14.492901 | 29.068717 | 58.355642 |
2 | 5853498713190525696 | He I_3 | 704.156775 | 4.863446e-15 | 1.879177e-15 | 14.016893 | 18.544342 | 15.292537 |
3 | 5762406957886626816 | H_beta | 484.420302 | 2.409925e-16 | -7.701987e-17 | 19.829839 | 30.636190 | 52.906350 |
4 | 5762406957886626816 | H_alpha | 655.954918 | 8.142886e-17 | -2.154309e-17 | 14.190875 | 19.575013 | 34.208743 |
5 | 5762406957886626816 | He I_3 | 707.165598 | 7.877543e-17 | 1.357762e-18 | 11.660292 | 1.981938 | 2.918542 |
A list of source IDs can be passed to the line finder as the first argument. The tool will then query the Archive for these objects.
sources_list = ['5853498713190525696', 5762406957886626816] # The source IDs can be either strings or long integers.
lines = find_lines(sources_list)
lines
Preparing required internal data...
0/2 [00:00<?, ?spec/s]
Done! Output saved to path: ./output_lines.csv
| | source_id | line_name | wavelength_nm | line_flux | depth | width | significance | sig_pwl |
---|---|---|---|---|---|---|---|---|
0 | 5853498713190525696 | He I_2 | 586.310487 | 1.236627e-15 | -5.690724e-17 | 22.490756 | 4.207830 | 5.056023 |
1 | 5853498713190525696 | H_alpha | 655.968059 | 4.178223e-15 | 1.856800e-15 | 14.492901 | 29.068717 | 58.355642 |
2 | 5853498713190525696 | He I_3 | 704.156775 | 4.863446e-15 | 1.879177e-15 | 14.016893 | 18.544342 | 15.292537 |
3 | 5762406957886626816 | H_beta | 484.420302 | 2.409925e-16 | -7.701987e-17 | 19.829839 | 30.636190 | 52.906350 |
4 | 5762406957886626816 | H_alpha | 655.954918 | 8.142886e-17 | -2.154309e-17 | 14.190875 | 19.575013 | 34.208743 |
5 | 5762406957886626816 | He I_3 | 707.165598 | 7.877543e-17 | 1.357762e-18 | 11.660292 | 1.981938 | 2.918542 |
f = '/path/to/XP_CONTINUOUS_RAW.fits'
lines = find_lines(f)
lines
Preparing required internal data...
0/2 [00:00<?, ?spec/s]
Done! Output saved to path: ./output_lines.fits
| | source_id | line_name | wavelength_nm | line_flux | depth | width | significance | sig_pwl |
---|---|---|---|---|---|---|---|---|
0 | 5853498713190525696 | He I_2 | 586.310487 | 1.236627e-15 | -5.690724e-17 | 22.490756 | 4.207830 | 5.056023 |
1 | 5853498713190525696 | H_alpha | 655.968059 | 4.178223e-15 | 1.856800e-15 | 14.492901 | 29.068718 | 58.355641 |
2 | 5853498713190525696 | He I_3 | 704.156775 | 4.863446e-15 | 1.879177e-15 | 14.016893 | 18.544343 | 15.292538 |
3 | 5762406957886626816 | H_beta | 484.420302 | 2.409925e-16 | -7.701987e-17 | 19.829839 | 30.636189 | 52.906350 |
4 | 5762406957886626816 | H_alpha | 655.954918 | 8.142886e-17 | -2.154309e-17 | 14.190875 | 19.575013 | 34.208743 |
5 | 5762406957886626816 | He I_3 | 707.165598 | 7.877543e-17 | 1.357762e-18 | 11.660292 | 1.981938 | 2.918542 |
All lines found and their properties are returned as a pandas DataFrame. For each detected line, the following properties are provided, as shown in the columns above: line_name, wavelength_nm, line_flux, depth, width, significance, and sig_pwl.
It is possible to change the source type to QSO and provide redshift(s). By default, the tool then uses a short list of lines (H_alpha, H_beta, CIV, CIII], MgII, Ly_alpha).
# A list of sourceIds
sources_list = [858200268935710208, 3937755935439564672]
# A list with redshifts
zets = [(858200268935710208, 0.06107), (3937755935439564672, 2.698)]
lines = find_lines(sources_list, source_type='qso', redshift=zets)
lines
Preparing required internal data...
0/2 [00:00<?, ?spec/s]
Done! Output saved to path: ./output_lines.csv
| | source_id | line_name | wavelength_nm | line_flux | depth | width | significance | sig_pwl |
---|---|---|---|---|---|---|---|---|
0 | 858200268935710208 | H_beta | 519.826376 | 3.409607e-17 | 1.465183e-17 | 26.649797 | 13.423813 | 25.802867 |
1 | 858200268935710208 | H_alpha | 696.920068 | 6.136522e-17 | 5.011845e-17 | 16.417239 | 38.716931 | 66.964718 |
2 | 3937755935439564672 | Ly_alpha | 453.975163 | 2.954658e-18 | 1.917204e-18 | 16.403438 | 5.230892 | 12.308939 |
3 | 3937755935439564672 | C IV | 570.721363 | 1.114978e-18 | 5.137242e-19 | 26.334381 | 5.040652 | 6.930669 |
4 | 3937755935439564672 | C III] | 710.276640 | 7.914149e-19 | 3.288256e-19 | 11.389798 | 2.849785 | 4.142953 |
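To see why the redshift matters, recall the standard relation between rest and observed wavelength, λ_obs = λ_rest · (1 + z). The sketch below applies it to the rest wavelengths of H_alpha and H_beta given in the example lines file of this notebook; `observed_wavelength` is a hypothetical helper written for this illustration and is not part of GaiaXPy.

```python
# Rest wavelengths [nm], taken from the example lines file in this notebook.
rest_nm = {"H_alpha": 656.461, "H_beta": 486.268}

def observed_wavelength(rest, z):
    """Shift a rest wavelength to the observed frame for redshift z."""
    return rest * (1.0 + z)

z = 0.06107  # redshift given above for source 858200268935710208

# Expected observed-frame position of each line for this source.
expected = {name: observed_wavelength(w, z) for name, w in rest_nm.items()}

for name, value in expected.items():
    print(f"{name}: {value:.3f} nm")
```

The positions reported in the table are those of the detected features, so they need not coincide exactly with these expected values.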
Additional arguments can be passed to the tools. These are described below.
The source mean BP/RP spectrum is described as a combination of basis functions. Particularly for faint sources or sources with a low number of observations, it is useful to represent the spectrum using a smaller set of basis functions to avoid higher-order bases fitting the noise in the observed data.
The truncation parameter is a boolean which toggles the truncation of the set of bases.
# A list of sourceIds
sources_list = [5853498713190525696, 5762406957886626816]
extrema_pwl_tr = find_fast(sources_list, truncation=True)
extrema_pwl_tr.head()
Running query... Done!
0/2 [00:00<?, ?spec/s]
Done! Output saved to path: ./output_lines.csv
| | source_id | xp | extrema |
---|---|---|---|
0 | 5853498713190525696 | BP | 14.874719 |
1 | 5853498713190525696 | BP | 16.885232 |
2 | 5853498713190525696 | BP | 17.522827 |
3 | 5853498713190525696 | BP | 31.731594 |
4 | 5853498713190525696 | BP | 31.946459 |
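The idea behind truncation can be pictured with a toy example. The sketch below represents a signal as a weighted sum of basis functions and then drops the higher-order terms. Everything in it is an assumption made for illustration: the cosine basis, the number of coefficients, and the number of terms kept are placeholders, and the way GaiaXPy chooses how many bases to retain is not described here.

```python
import numpy as np

def truncate(coeffs, n_keep):
    """Return a copy of coeffs with every coefficient from index n_keep on set to zero."""
    out = np.array(coeffs, dtype=float)
    out[n_keep:] = 0.0
    return out

def evaluate(coeffs, x):
    """Evaluate sum_k coeffs[k] * cos(k * x); the cosine basis is only a placeholder."""
    k = np.arange(len(coeffs))
    return coeffs @ np.cos(np.outer(k, x))

rng = np.random.default_rng(0)
# Three dominant low-order terms plus seven small, noise-like high-order terms.
coeffs = np.concatenate([[1.0, 0.5, 0.25], 0.01 * rng.standard_normal(7)])
x = np.linspace(0.0, np.pi, 50)

full = evaluate(coeffs, x)                  # uses all ten terms
smooth = evaluate(truncate(coeffs, 3), x)   # uses only the first three terms
```

In this toy case `smooth` differs from `full` only by the small high-order contributions, which is the behaviour truncation aims for on noisy spectra.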
An independent set of lines can be provided in the form of a list or a file through the user_lines parameter. This option is available for the line finder tool only.
# A list of sourceIds
sources_list = [5853498713190525696, 5762406957886626816]
# A list with user's lines: wavelengths [nm] and names
new_lines = [(434.0472, 410.1734), ('H_gamma', 'H_delta')]
lines = find_lines(sources_list, user_lines=new_lines)
lines
Preparing required internal data...
0/2 [00:00<?, ?spec/s]
Done! Output saved to path: ./output_lines.csv
| | source_id | line_name | wavelength_nm | line_flux | depth | width | significance | sig_pwl |
---|---|---|---|---|---|---|---|---|
0 | 5762406957886626816 | H_gamma | 431.162668 | 3.534201e-16 | -7.068856e-17 | 13.200787 | 12.131214 | 33.215733 |
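The list passed to user_lines in the example above holds two tuples, one with the wavelengths and one with the matching names. If the lines are kept in a dictionary, a small helper can produce that layout. `to_user_lines` is a hypothetical convenience function written for this sketch, not part of GaiaXPy.

```python
def to_user_lines(lines_by_name):
    """Convert {name: wavelength_nm} into [(wavelengths...), (names...)]."""
    names = tuple(lines_by_name)
    wavelengths = tuple(lines_by_name[name] for name in names)
    return [wavelengths, names]

# Same lines as in the example above, kept as name -> wavelength [nm].
new_lines = to_user_lines({"H_gamma": 434.0472, "H_delta": 410.1734})
```

Keeping names and wavelengths paired in a dictionary avoids the two tuples drifting out of step when lines are added or removed.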
# Path to file with lines
f = '/path/to/lines_example.txt'
lines = find_lines(sources_list, user_lines=f)
lines
Preparing required internal data...
0/2 [00:00<?, ?spec/s]
Done! Output saved to path: ./output_lines.csv
| | source_id | line_name | wavelength_nm | line_flux | depth | width | significance | sig_pwl |
---|---|---|---|---|---|---|---|---|
0 | 5853498713190525696 | H_alpha | 655.968059 | 4.178223e-15 | 1.856800e-15 | 14.492901 | 29.068717 | 58.355642 |
1 | 5762406957886626816 | H_beta | 484.420302 | 2.409925e-16 | -7.701987e-17 | 19.829839 | 30.636190 | 52.906350 |
2 | 5762406957886626816 | H_alpha | 655.954918 | 8.142886e-17 | -2.154309e-17 | 14.190875 | 19.575013 | 34.208743 |
Note: The file should contain two columns separated by spaces: the first for wavelengths [nm] and the second for line names, with no header.
E.g.:
656.461 H_alpha
486.268 H_beta
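A file in this format can be written with a few lines of Python. The sketch below is one possible way to do it; `write_lines_file` is a hypothetical helper, and the file is written to a temporary directory so the example has no side effects.

```python
import os
import tempfile

# (wavelength [nm], name) pairs, as in the example file above.
lines = [(656.461, "H_alpha"), (486.268, "H_beta")]

def write_lines_file(path, lines):
    """Write one 'wavelength name' pair per line, space-separated, with no header."""
    with open(path, "w") as f:
        for wavelength_nm, name in lines:
            f.write(f"{wavelength_nm} {name}\n")

path = os.path.join(tempfile.mkdtemp(), "lines_example.txt")
write_lines_file(path, lines)

# Read the file back to check its content.
with open(path) as f:
    content = f.read()
```

This sketch assumes line names without spaces, since a name containing a space would produce more than two space-separated columns.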
It is possible to plot spectra with the detected extrema or lines marked by using the boolean parameter plot_spectra. This option is available for the line finder and extrema finder tools only.
lines = find_lines([5762406957886626816], plot_spectra=True)
Preparing required internal data...
0/1 [00:00<?, ?spec/s]
Done! Output saved to path: ./output_lines.csv
The additional boolean parameter save_plots tells the program whether to save the plots.
lines = find_lines([5853498713190525696], plot_spectra=True, save_plots=True)
Preparing required internal data...
Done! Output saved to path: ./output_lines.csv
Three parameters, output_path, output_file, and output_format, define the full path of the resulting file.
The default output path is the current path. If the given output path does not exist, it will be created.
The default output file name is 'output_lines'.
The default output format is the same as the format of the input file (for example, if the input file is a FITS file, the output file will also be a FITS file by default), or CSV in any other case (DataFrame, ADQL query, or list).
lines = find_lines([5853498713190525696], output_path='.', output_file='my_file', output_format='ecsv')
Preparing required internal data...
0/1 [00:00<?, ?spec/s]
Done! Output saved to path: ./my_file.ecsv
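The rules above can be summarised in a short sketch. `resolve_output` is a hypothetical function written to mirror the description in this section; it is not GaiaXPy's actual implementation, and it only builds the path without creating any directory or file.

```python
import os

KNOWN_FORMATS = ("csv", "ecsv", "fits", "xml")

def resolve_output(input_data, output_path=".", output_file="output_lines", output_format=None):
    """Build the output file path from the three output parameters."""
    if output_format is None:
        ext = ""
        # Only a string input can be a file path with an extension.
        if isinstance(input_data, str):
            ext = os.path.splitext(input_data)[1].lstrip(".").lower()
        # Reuse the input file format when recognised, otherwise fall back to CSV.
        output_format = ext if ext in KNOWN_FORMATS else "csv"
    return os.path.join(output_path, f"{output_file}.{output_format}")

from_list = resolve_output([5853498713190525696], ".", "my_file", "ecsv")
from_fits = resolve_output("/path/to/XP_CONTINUOUS_RAW.fits")
from_query = resolve_output("select TOP 2 source_id from gaiadr3.gaia_source")
```

An ADQL query is also a string, which is why the sketch checks the extension against a list of known formats instead of trusting any text after a dot.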
The additional parameter save_file is a boolean that tells the program whether to save the results or not. If output_file is given but save_file is set to False, a warning will be raised.
lines = find_lines([5853498713190525696], output_path='.', output_file='my_file', output_format='ecsv', save_file=False)
Preparing required internal data...
0/1 [00:00<?, ?spec/s]
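The interplay between save_file and output_file might look roughly like the sketch below. `check_save_options` and its warning text are hypothetical and only mirror the behaviour described above; here `None` stands for "output_file was not given explicitly", which is an assumption of this sketch.

```python
import warnings

def check_save_options(save_file=True, output_file=None):
    """Warn when a file name is supplied although saving is switched off."""
    if not save_file and output_file is not None:
        warnings.warn("output_file was given but save_file is False; no file will be written.")
    return save_file

# Saving is enabled, so no warning is issued.
will_save = check_save_options(save_file=True, output_file="my_file")
```

Calling `check_save_options(save_file=False, output_file="my_file")` issues the warning and returns `False`.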